Communication-Computation Efficient Gradient Coding
Authors
Abstract
This paper develops coding techniques to reduce the running time of distributed learning tasks. It characterizes the fundamental tradeoff in computing gradients (and, more generally, vector summations) among three parameters: computation load, straggler tolerance, and communication cost. It further gives an explicit coding scheme, based on recursive polynomial constructions, that achieves the optimal tradeoff by coding across both data subsets and vector components. As a result, the proposed scheme makes it possible to minimize the running time of gradient computations. The scheme is implemented on Amazon EC2 clusters using Python with the mpi4py package. Results show that it maintains the same generalization error while reducing the running time by 32% compared to uncoded schemes and by 23% compared to prior coded schemes that focus only on stragglers (Tandon et al., ICML 2017).
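As a point of reference for the straggler-only baseline cited in the abstract, the following is a minimal sketch of the kind of small gradient code commonly used to illustrate that line of work: 3 workers, 3 data subsets, tolerance of 1 straggler. It is not the recursive polynomial construction of this paper, and it does not reduce communication cost. The matrix `B`, the table `DECODE`, and the helpers `encode` and `decode` are hypothetical names chosen for illustration.

```python
from itertools import combinations

# Assumed toy encoding matrix: row i holds the coefficients worker i applies
# to the partial gradients (g1, g2, g3). Each worker touches only 2 subsets.
B = [
    [0.5, 1.0, 0.0],   # worker 0 sends g1/2 + g2
    [0.0, 1.0, -1.0],  # worker 1 sends g2 - g3
    [0.5, 0.0, 1.0],   # worker 2 sends g1/2 + g3
]

# Decoding coefficients for each possible pair of workers that respond.
# Each pair (a, b) satisfies a * B[i] + b * B[j] == [1, 1, 1].
DECODE = {
    (0, 1): (2.0, -1.0),
    (0, 2): (1.0, 1.0),
    (1, 2): (1.0, 2.0),
}


def encode(worker, partial_grads):
    """Return the single coded vector that `worker` would transmit."""
    dim = len(partial_grads[0])
    return [
        sum(B[worker][j] * partial_grads[j][c] for j in range(3))
        for c in range(dim)
    ]


def decode(messages):
    """Recover g1 + g2 + g3 from the coded vectors of any two workers.

    `messages` maps a worker index to the vector that worker sent.
    """
    i, j = sorted(messages)
    a, b = DECODE[(i, j)]
    return [a * x + b * y for x, y in zip(messages[i], messages[j])]


# Toy partial gradients, one 2-dimensional vector per data subset.
partial_grads = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0]]
full_gradient = [sum(col) for col in zip(*partial_grads)]

# Whichever single worker straggles, the other two suffice.
for pair in combinations(range(3), 2):
    msgs = {w: encode(w, partial_grads) for w in pair}
    assert decode(msgs) == full_gradient
```

In this sketch every worker still transmits a full-length vector. The abstract's scheme additionally codes across vector components so that each transmitted message can be shorter, at the price of a higher computation load or a lower straggler tolerance.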
Similar resources
Variance-based Gradient Compression for Efficient Distributed Deep Learning
Due to the substantial computational cost, training state-of-the-art deep neural networks on large-scale datasets often requires distributed training using multiple computation workers. However, by nature, workers need to frequently communicate gradients, causing severe bottlenecks, especially on lower-bandwidth connections. A few methods have been proposed to compress gradients for efficient c...
An efficient secure channel coding scheme based on polar codes
In this paper, we propose a new framework for a joint encryption-encoding scheme based on polar codes, namely an efficient and secure joint secret-key encryption channel coding scheme. We address the use of a new coding structure, polar codes, in Rao-Nam (RN)-like schemes. Cryptanalysis methods show that the proposed scheme has an acceptable level of security with a relatively smaller ke...
Gradient Sparsification for Communication-Efficient Distributed Optimization
Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead of exchanging information, such as stochastic gradients, among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding...
An Efficient Parallel Discrete
We present a parallel iterative solver for discrete second-order elliptic PDEs. It is based on the conjugate gradient algorithm with incomplete factorization preconditioning, using a domain-decomposed ordering to allow parallelism in the triangular solves, and resorting to a recently developed parallelization technique to avoid the communication bottleneck for the computation associated ...
Parallel Execution Time Analysis for Least Squares Problems on Distributed Memory Architectures
In this paper we study the parallelization of PCGLS, a basic iterative method whose main idea is to apply the preconditioned conjugate gradient method to the normal equations. Two important questions are discussed: what is the best possible data distribution, and which communication network topology is most suitable for solving least squares problems on massively paralle...
Journal title:
- CoRR
Volume abs/1802.03475, Issue -
Pages -
Publication date 2018